CHAPTER 1:
Introduction 


Several recent high-profile weaknesses in pseudorandom number generators (PRNGs) have
led to cryptographic flaws and, in turn, to security risks. One example is the
compromise of Cryptocat group chats due to a programming flaw [1]. A bug in the
BigInt library caused the random number generator to have a slight bias when generating
keys, making it possible to crack these keys within a day [2]. Another example is a flaw
in OpenSSL’s entropy pool on Debian, which reduced its strength from 256 bits to 15 bits [3].
As a result, only 32,767 distinct secure shell (SSH) keys of a given type and size could be
generated, making a brute-force attack on the key practical. More recently, improper
initialization of the PRNG led to Android digital wallets being hijacked [4].

For military systems where PRNG design and evaluation cannot be conducted openly, testing
PRNGs empirically may be the only option available for assessing the properties of
these security-critical components; this is true for the many embedded and proprietary
systems employed in national security applications.

PRNG test suites provide insight and metrics for these security-critical system components.
However, to date there have been no thorough performance analyses of PRNG test suites. A
review of existing statistical test suites showed that they employ a battery of efficiently
implemented, heavily optimized tests, but that these tests are run in
serial. PRNGs and existing statistical test suites are reviewed in Chapter 2.

This thesis added multi-threading to Dieharder, a well-known PRNG test suite, to
significantly speed up PRNG testing on multi-core systems. The modifications to Dieharder
and the platform used to conduct the experiments are described in Chapter 3. To
implement multi-threading, two different libraries for threading were evaluated. The first
implementation used a Portable Operating System Interface (POSIX) thread pool library
implemented by Mark Gondree [5], and the second implementation used Open
Multi-Processing (OpenMP) [6]. The experiments with these various approaches to improve
performance, and their results, are discussed in Chapter 4.

The primary contributions of this thesis are as follows:

•  We modify Dieharder to support multi-threading, creating a variant test suite
(Dieharder-T).

•  We integrate Gondree’s thread pool library into Dieharder-T and evaluate the
performance of the test suite.

•  We integrate OpenMP into Dieharder-T and evaluate the performance of the test
suite with both static and dynamic scheduling.

•  We find that both the thread pool and OpenMP variants performed better than Dieharder.
OpenMP performed similarly to or better than the thread pool, depending on the number
of threads used.

•  We conclude that for Dieharder-T, OpenMP with static scheduling offered the best
performance.

•  We propose a hybrid scheduling solution for Dieharder-T that utilizes the advantages
of static and dynamic scheduling.

1.1  Organization 

The thesis is organized as follows. In Chapter 2, we provide a formal definition of a PRNG
and discuss a few existing statistical test suites. In Chapter 3, we describe the design of
the multi-threaded application and the platform used for the experiments. In Chapter 4, we
discuss the experiments performed and analyze the results. In Chapter 5, we conclude
and summarize future work.

CHAPTER 2:
Background 


In this chapter, we review pseudorandom number generators (PRNGs) and provide a formal
definition. We discuss a few existing statistical test suites before going into more detail on
the two of interest to this study: Diehard and Dieharder.

2.1  Pseudorandom  Number  Generator 

A PRNG is an implementation of an algorithm that generates a deterministic sequence of
numbers that appears to be random. Because the numbers are generated from an initial seed,
the sequence can be reproduced by anyone who knows both the seed and the algorithm. If the
sequence of numbers produced is interpreted as a sequence of bits, the PRNG may be
called a deterministic random bit generator (DRBG). For the output of a generator to be
deemed pseudorandom, it “should be indistinguishable from a truly random sequence to
an attacker” [7].
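
The reproducibility property can be illustrated with a minimal sketch. It uses Python's standard random.Random class purely for illustration; the seed values and the helper name first_outputs are assumptions and are not part of any test suite discussed in this thesis.

```python
import random

def first_outputs(seed, count=5):
    """Return the first `count` 32-bit outputs of a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]

# Same seed and same algorithm: the sequence is reproduced exactly.
run_a = first_outputs(seed=12345)
run_b = first_outputs(seed=12345)

# A different seed selects a different sequence.
run_c = first_outputs(seed=54321)

print(run_a == run_b)
print(run_a == run_c)
```

The first comparison holds for any seed, which is exactly why an attacker who learns the seed and the algorithm can regenerate every output.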

There are multiple applications for PRNGs, notably in simulation and cryptography. Bad
PRNGs can cause misleading results in simulations [8]. There are many applications of
PRNGs in cryptography, such as generating keys, initialization vectors and nonces.
Low-entropy sequences in these applications often result in loss of security and attacks against
implementations of cryptographic systems.

For a random sequence of numbers to be usable for cryptography, the sequence must be
deemed uniformly distributed from the perspective of a computationally bounded adversary.
Weakness in a PRNG can lead to cryptographic flaws, resulting in security risks. One
example is the compromise of Cryptocat group chats due to a programming flaw [1]. A bug
in the BigInt library caused the random number generator to
have a slight bias when generating keys, making it possible to crack these keys within a
day [2]. Another example is a flaw in OpenSSL’s entropy pool on Debian, which reduced its
strength from 256 bits to 15 bits [3]. As a result, only 32,767 distinct SSH
keys of a given type and size could be generated, making a brute-force attack on the key
practical. More recently, improper initialization of a PRNG led to Android digital wallets
being hijacked [4].

Adopting the definition of Desai, Hevia and Yin [7], we define a PRNG $\mathcal{GE}$ as a tuple of
algorithms, $\mathcal{GE} = (\mathcal{K}, \mathcal{G})$. The seed generation algorithm $\mathcal{K}$ takes a security
parameter $k$ as input and generates a key $K$ and an initial state $s_0$. The generation
algorithm $\mathcal{G}$ generates the next state $s_i$, for $i \geq 1$, and an output $y_i$, using the key $K$,
the current state $s_{i-1}$ and an auxiliary input $t_i$. The block length of the PRNG is the length
of the PRNG output in each iteration, i.e., $n = |y_i|$, where $y_i$ is a sequence of bits
$y_i[0], y_i[1], \ldots, y_i[n-1]$.
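
The shape of this definition can be sketched in code. The following is a toy illustration only: the use of HMAC-SHA256, the function names seed_gen and generate, and the domain-separation labels are assumptions made for this sketch, not the construction analyzed in [7] and not a vetted generator.

```python
import hashlib
import hmac
import os

def seed_gen(k):
    """Seed generation algorithm: security parameter k (bits) -> (key K, state s_0)."""
    n_bytes = k // 8
    return os.urandom(n_bytes), os.urandom(n_bytes)

def generate(key, state, aux=b""):
    """Generation algorithm: (K, s_{i-1}, t_i) -> (y_i, s_i)."""
    # Separate labels keep the output block and the next state distinct.
    y = hmac.new(key, b"out" + state + aux, hashlib.sha256).digest()
    next_state = hmac.new(key, b"state" + state + aux, hashlib.sha256).digest()
    return y, next_state

key, state = seed_gen(256)
outputs = []
for _ in range(3):
    y, state = generate(key, state)
    outputs.append(y)

# Block length n = |y_i|, measured in bits.
block_length_bits = 8 * len(outputs[0])
print(block_length_bits)
```

Each call consumes the previous state and returns the next one, so the caller threads the state through successive iterations exactly as the indices $s_{i-1}$ and $s_i$ suggest.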

A generator is considered practical if this sequence of bits is easy to generate. A generator
is considered secure (or unpredictable) if this sequence of bits appears indistinguishable
from random to any computationally bounded adversary. This notion can be formally
described in several ways, but two common notions are the next-bit test and Yao’s statistical
test [9]. In the former notion, no polynomial-time Turing machine has significant success
in observing the first i bits of a sequence and accurately predicting bit i + 1 in the sequence.
In the latter notion, no statistical test represented by a polynomial-sized circuit has
significant success in differentiating the first i bits of the generated sequence from an i-bit long
random sequence. Yao’s theorem shows that these two notions are, in fact, related: a
collection of i-bit sequences “passes the next-bit-test if and only if it passes all polynomial-sized
statistical tests” [10].
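
To make the next-bit notion concrete, the sketch below shows a generator that fails it badly. It assumes a 32-bit linear congruential generator with an odd multiplier and odd increment (the constants are illustrative) and looks only at the least significant bit of each output. Because both constants are odd and the modulus is a power of two, that bit alternates, so a trivial predictor guesses every next bit correctly. The function names are hypothetical.

```python
def lcg_low_bits(seed, count):
    """Return the least significant bit of each output of a 32-bit LCG."""
    a, c, m = 1664525, 1013904223, 2**32  # illustrative constants (assumed)
    x = seed
    bits = []
    for _ in range(count):
        x = (a * x + c) % m
        bits.append(x & 1)
    return bits

def predict_next(prefix):
    """Next-bit predictor: guess the complement of the last observed bit."""
    return 1 - prefix[-1]

def predictor_success_rate(bits):
    """Fraction of positions i >= 1 where the predictor, given bits[:i], gets bits[i]."""
    hits = sum(predict_next(bits[:i]) == bits[i] for i in range(1, len(bits)))
    return hits / (len(bits) - 1)

bits = lcg_low_bits(seed=2024, count=1000)
print(predictor_success_rate(bits))
```

A secure generator would hold any efficient predictor to a success rate negligibly above one half; here the rate is far above that bound, so this bit stream is not pseudorandom in the sense defined above.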

It is often much easier to show at the design level that a generator is provably secure in
the next-bit-test sense. The notion that a generator is secure if no statistical test appears
to exist that differentiates it from random, however, is both intuitive and natural. As a result,
statistical test suites have been developed which may be used to validate both design and
implementation. Statistical test suites, however, provide a much weaker guarantee than the
notion introduced by Yao, which covers all practical statistical tests (including those yet to
be imagined).

2.2  Statistical  Test  Suites 

Many statistical tests have been proposed by different authors for testing PRNGs, and
these have been collected into statistical test suites. For example, the National Institute
of Standards and Technology (NIST) statistical test suite (STS) is a statistical test suite
released in 2001, designed for testing cryptographically secure pseudorandom number
generators (CSPRNGs). It consists of 16 simple statistical tests of non-randomness in binary
sequences, intended as a baseline and reference implementation for statistical testing [11].
In 2007, L’Ecuyer and Simard released TestU01 for statistical testing of PRNGs [12],
providing some tests not previously implemented in other test suites. Next, we discuss two
suites in more detail, as they are most relevant to this thesis: Diehard and Dieharder.
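
As an example of how simple such a test can be, the sketch below implements the frequency (monobit) test in the form commonly given for the NIST STS: map bits to +1/-1, sum them, normalize by the square root of the sequence length, and convert the result to a p-value with the complementary error function. The function name and the sample inputs are assumptions for illustration; this is not code from any of the suites discussed.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # +1 for each one, -1 for each zero
    s_obs = abs(s) / math.sqrt(n)           # normalized absolute sum
    return math.erfc(s_obs / math.sqrt(2))  # small values indicate imbalance

balanced = [0, 1] * 50
all_ones = [1] * 100

print(monobit_p_value(balanced))
print(monobit_p_value(all_ones) < 0.01)
```

A perfectly balanced sequence gives the largest possible p-value, which shows why one test is never enough: the strictly alternating sequence above is balanced yet obviously not random.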

2.2.1  Diehard  Test  Suite 

One of the most popular statistical test suites is Marsaglia’s Diehard battery of tests,
released in 1996. The suite includes 15 tests, written in C, translated from Fortran via the
f2c utility. Detailed test descriptions are released with the test suite and not reproduced
here [13], but the list of tests is reproduced in Table 2.1.

The tests require an input binary file of 32-bit integers to represent the sequence of random
numbers [14]. For input files shorter than the length required by a test, the test runs
until the end of the file, outputs an “END OF FILE” message and skips to the next test. All
tests, except the Runs Test, require various parameters such as sample size, bit patterns, etc.
These parameters have been preset so that users can employ the test suite with ease, and they
are not configurable by the user. A user who wishes to customize these parameters would
have to modify the source code and recompile the test suite.

2.2.2  Dieharder  Test  Suite 

Brown developed the Dieharder test suite as a GNU-licensed reimplementation of the
Diehard test suite [15]. Dieharder tests are C code rewritten from the test descriptions
of Diehard and NIST STS. It also includes additional tests developed by Robert G.
Brown and David Bauer. The test suite was named Dieharder both as a movie sequel pun
and as a tribute to George Marsaglia, author of the Diehard test suite.

Diehard and Dieharder are significantly different. The former accepts only a binary file as
test input, requiring on the order of ten million random numbers [14]; the latter,
however, prefers to test generators wrapped in the GNU Scientific Library (GSL) interface so as to
receive an unbounded stream of random numbers. Currently, 80 well-known generators are
supported, many of which are drawn directly from the GSL. Support for reading raw input
from a file is also wrapped in a GSL interface. The rationale for using this generator-like,
streaming API is to support larger sequences of random numbers. Modern applications
may require much more than $10^{18}$ random numbers generated from millions of seeds; these
may be sensitive to random number generator (RNG) weaknesses that might not be
discovered by sequences limited to $10^{7}$ random numbers. As of writing, the current version of
Dieharder (3.31.1) has 31 fully implemented tests [16] (see Table 2.1), including tests from
Diehard, NIST STS, tests designed by Brown and Bauer, as well as popular tests from other
sources.


2.3  Test  Suite  Design 

There are many parameters used in Dieharder that would reasonably be used in other test
suites as well. These parameters control how many random values are tested
in each individual test (psamples, tsamples, multiply_p), how the generator is seeded (seed,
strategy) and how each test is run (Xtrategy, Xoff, ntuple, ksflag). Some of the parameters [17]
used in Dieharder are explained here to show how they affect the test results.

•  psamples: Number of p-value samples per test.

•  tsamples: Number of trials used in each test.

•  multiply_p: Multiply the number of psamples for each test by a constant amount.

•  seed: Initial seed value for the PRNG.

•  strategy: Reseeding strategy. The default (0) reseeds only at the start of the program;
a non-zero value reseeds at the start of each test.

•  xtrategy: Strategy for when to stop a test. The default (0) runs tests with a specified
number of psamples and tsamples; resolve ambiguity mode reruns the test, adding
psamples until the ambiguity is resolved; test to destruction mode reruns the test until
it fails or reaches the maximum psamples.

•  xoff: Maximum number of psamples used to determine that a test is ‘good’.

•  ksflag: Which Kolmogorov-Smirnov test type to run. The default (0) is “fast but
slightly sloppy” for psamples > 4999. A much slower but more accurate mode (1) is
available for larger numbers of psamples, and a very slow mode (2) is accurate to
machine precision.

•  ntuple: Set the ntuple length for tests on short bit strings that permit the length to be
varied.
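
The role of psamples and ksflag is easier to see with a small sketch. For a good generator, the p-values collected from repeated runs of a test should be uniformly distributed on [0, 1], and a Kolmogorov-Smirnov test measures how far the collected sample is from that distribution. The code below computes only the Kolmogorov-Smirnov distance itself; converting it to a final p-value (the step that ksflag selects a method for) is omitted. The function name and the two synthetic samples are assumptions for illustration, not Dieharder code.

```python
def ks_statistic_uniform(p_values):
    """Kolmogorov-Smirnov distance between a sample and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    # Largest gap where the empirical CDF is above / below the uniform CDF.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

# 100 synthetic "psamples": one evenly spread, one clustered near 1.
evenly_spread = [(i + 0.5) / 100 for i in range(100)]
clustered = [0.9 + i / 1000 for i in range(100)]

print(ks_statistic_uniform(evenly_spread))
print(ks_statistic_uniform(clustered))
```

The clustered sample yields a much larger distance than the evenly spread one; with more psamples, smaller departures from uniformity become detectable, which is why multiply_p increases both sensitivity and run-time.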

Table 2.1. Comparison of Tests Available in Diehard and Dieharder.

Diehard                               Dieharder
------------------------------------  ------------------------------------------
Birthday Spacings                     Diehard Birthday Test
Overlapping Permutations              Diehard OPERM5 Test
Ranks of 31x31 and 32x32 Matrices     Diehard 32x32 Binary Rank Test
Ranks of 6x8 Matrices                 Diehard 6x8 Binary Rank Test
Monkey Tests on 20-bit Words          Diehard Bitstream Test
Monkey Tests OPSO, OQSO, DNA          Diehard OPSO Test
                                      Diehard OQSO Test
                                      Diehard DNA Test
Count the 1’s in a Stream of Bytes    Diehard Count the 1’s (stream) Test
Count the 1’s in Specific Bytes       Diehard Count the 1’s (byte) Test
Parking Lot Test                      Diehard Parking Lot Test
Minimum Distance Test                 Diehard Minimum Distance (2d Circle) Test
3D Spheres Test                       Diehard 3d Sphere (Minimum Distance) Test
The Squeeze Test                      Diehard Squeeze Test
Overlapping Sums Test                 Diehard Sums Test
Runs Test                             Diehard Runs Test
Craps Test                            Diehard Craps Test
-                                     Marsaglia and Tsang GCD Test
-                                     STS Monobit Test
-                                     STS Runs Test
-                                     STS Serial Test (Generalized)
-                                     RGB Bit Distribution Test
-                                     RGB Generalized Minimum Distance Test
-                                     RGB Permutations Test
-                                     RGB Lagged Sum Test
-                                     RGB Kolmogorov-Smirnov Test
-                                     DAB Byte Distribution
-                                     DAB DCT
-                                     DAB Fill Tree Test
-                                     DAB Fill Tree 2 Test
-                                     DAB Monobit 2 Test

CHAPTER 3:
Methodology 


This chapter discusses in detail the methodology used for this thesis. It describes the
design of the multi-threaded application and the platform used for the experiments.

3.1  Program  Design 

Dieharder version 3.31.1 was used as the code base to implement a modified multi-threaded
Dieharder test suite named Dieharder-T. The main code modifications were to the way
individual statistical tests were scheduled to run and to the individual test logic, which
was changed to support parallel processing.

Each  statistical  test  from  Dieharder  is  contained  within  a  single  file  that  implements  the 
base  class  Test.c.  It  contains  the  statistical  test  logic,  input  parameters  as  well  as  the  test 
results  (outputs).  Dieharder-T  separated  the  test  logic  and  input  parameters  to  ITest.c,  and 
the  outputs  of  the  test  to  OTest.c. 

Dieharder includes an “all-tests” mode as a convenient way to benchmark the PRNG
being tested. This benchmark mode does more than run all 31 tests in the test suite.
It also runs various statistical tests by Brown with varying ntuple values to
comprehensively stress test the generator. The list of tests and corresponding ntuple values used in
this mode is shown in Table 3.1. A total of 80 statistical tests were run when not running
in resolve ambiguity (RA) or test to destruction (TTD) Xtrategy modes, which might
increase the number of statistical tests to run depending on test results, as explained in Section
2.3. To enable a fair comparison between Dieharder and Dieharder-T, Dieharder-T
executes exactly the same 80 statistical tests when running in this benchmark mode.
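
The figure of 80 statistical tests follows directly from Table 3.1, as the short tally below shows. The dictionary holds the four tests that are run over a range of ntuple values (ranges are inclusive); every other test in the table is run once. The variable names are chosen for this sketch only.

```python
# Tests run over an inclusive range of ntuple values (from Table 3.1).
ntuple_ranges = {
    "RGB Bit Distribution Test": (1, 12),
    "RGB Generalized Minimum Distance Test": (2, 5),
    "RGB Permutations Test": (2, 5),
    "RGB Lagged Sum Test": (0, 32),
}

# Tests run once: 17 Diehard, 1 GCD, 3 STS, 1 RGB Kolmogorov-Smirnov, 5 DAB.
single_run_tests = 17 + 1 + 3 + 1 + 5

varied_runs = sum(high - low + 1 for low, high in ntuple_ranges.values())
distinct_tests = single_run_tests + len(ntuple_ranges)
total_runs = single_run_tests + varied_runs

print(distinct_tests, varied_runs, total_runs)
```

More than half of the 80 runs come from the four ntuple-varied RGB tests, and the Lagged Sum Test alone accounts for 33 of them.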

To implement multi-threading in Dieharder-T, two different libraries for threading were
evaluated. The first implementation used a POSIX thread pool library implemented by
Mark Gondree [5], and the second implementation used OpenMP [6]. As the tests do not
depend on the results of other tests, they can be executed in parallel without altering the test
results. To validate test results against Dieharder, the random numbers provided to
both Dieharder and Dieharder-T have to be the same so that they generate the same p-value.
Since the sequence of tests executed in Dieharder-T differs from that in Dieharder due to
parallelization, the PRNG must be reseeded at the beginning of each test to
ensure that the same tsamples are provided to both Dieharder and Dieharder-T.

3.1.1  Dieharder-T 

With the thread pool library by Gondree, a job is created for each statistical test to
be executed and added to a common thread pool queue. Whenever a thread in the thread
pool becomes available, the next job in the queue is assigned to it. When all jobs in the queue
have completed, the experiment terminates successfully. The experiment that uses
the thread pool library will be referred to as Dieharder-T in the following chapters.
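
The job-queue pattern described above can be sketched with Python's standard concurrent.futures module. This is an analogy only: it does not reproduce Gondree's C library or its API, the run_test body is a placeholder that returns a dummy value, and Python threads do not speed up CPU-bound work the way POSIX threads do in C.

```python
from concurrent.futures import ThreadPoolExecutor

def run_test(test_id):
    """Placeholder for one statistical test; returns (test_id, dummy p-value)."""
    p_value = (test_id * 37 % 100) / 100  # dummy value, not a real statistic
    return test_id, p_value

def run_all(test_ids, num_threads):
    """Queue one job per test; idle worker threads take the next queued job."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(run_test, t) for t in test_ids]
        # Leaving the `with` block waits for every job to finish.
        return dict(f.result() for f in futures)

results = run_all(range(80), num_threads=4)
print(len(results))
```

Because each job is independent and results are keyed by test, the outcome does not depend on which thread ran which job or in what order the jobs completed.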

3.1.2  Dieharder-T-OMP 

OpenMP provides compiler directives, library routines and environment variables to
achieve parallelism in a program. The statistical tests are executed in a parallel
for loop directive, allowing OpenMP to schedule the tests on different threads. By default,
OpenMP uses a static schedule that assigns loop iterations to threads before execution. In a
4-thread application with 80 loop iterations, static scheduling assigns iterations 0 to 19 to
thread ID 0, iterations 20 to 39 to thread ID 1, iterations 40 to 59 to thread ID 2 and
iterations 60 to 79 to thread ID 3. This scheduling policy favors tasks that have similar
execution times, so it might not suit the test suite, whose statistical tests vary in
execution time. Static scheduling will be referred to as Dieharder-T-OMP-S in the following
chapters. Another type of scheduling available in OpenMP is dynamic scheduling. This mode
assigns a loop iteration to whichever thread is available, similar to how the thread pool
library works. It allows a more balanced execution time across threads, but incurs a higher
processing overhead, as each thread must wait after each task to receive the next iteration
to execute. The trade-off between higher overhead and balanced workload is analyzed in the
next chapter. Dynamic scheduling will be referred to as Dieharder-T-OMP-D in the following
chapters.
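
The difference between the two policies can be explored with a small simulation. The sketch below models static scheduling as contiguous blocks of iterations and dynamic scheduling as handing each iteration to the thread that becomes free first, with an optional fixed overhead per hand-off. The per-test costs and the overhead value are invented for illustration and are not measurements from this thesis.

```python
import heapq

def static_blocks(num_iters, num_threads):
    """Contiguous block assignment: one list of iteration indices per thread."""
    base, extra = divmod(num_iters, num_threads)
    blocks, start = [], 0
    for t in range(num_threads):
        size = base + (1 if t < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks

def static_makespan(costs, num_threads):
    """Finish time of the slowest thread under static scheduling."""
    blocks = static_blocks(len(costs), num_threads)
    return max(sum(costs[i] for i in block) for block in blocks)

def dynamic_makespan(costs, num_threads, overhead=0.0):
    """Finish time when each iteration goes to the first thread to become free."""
    free_at = [0.0] * num_threads
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + overhead + cost)
    return max(free_at)

# Hypothetical workload: 60 cheap tests followed by 20 expensive ones.
uneven = [1] * 60 + [10] * 20
print(static_makespan(uneven, 4), dynamic_makespan(uneven, 4))

# Hypothetical workload: 80 equal tests, with a hand-off cost for dynamic.
even = [1] * 80
print(static_makespan(even, 4), dynamic_makespan(even, 4, overhead=0.1))
```

With the uneven workload, the thread holding the last block dominates the static run-time, while dynamic assignment spreads the expensive tests out. With equal costs, the static schedule is already balanced and the hand-off overhead makes dynamic scheduling slightly slower. Which effect dominates in practice depends on the real run-times of the tests and on their order in the loop.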

3.2  Experiment  Platform 

The experiments were conducted on Amazon Web Services (AWS) Elastic Compute Cloud
(EC2) [18] to provide a reproducible platform for anyone interested in validating the
experiment results. The deployment platform was a c3.xlarge [19] compute-optimized
instance with 4 virtual CPU cores. The instance used was not a dedicated instance, and the
physical hardware was shared among different users. As a result of hardware sharing, there
was high variance in execution time between some repetitions of the same
statistical test. The affected experiments were repeated to verify experiment execution
time.

The experiments were set up using a shell script to vary the parameters used and to repeat
individual tests. The Linux time utility [20] was used to measure the time elapsed for each
test, and the real (wall clock) time was used to measure execution time. This provided a
fair comparison between a multi-threaded application and a single-threaded application, as it
measured the absolute time taken to execute each experiment.
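
A driver of this kind might look like the following sketch. It is not the shell script used in the thesis: it swaps the Linux time utility for Python's time.perf_counter, which also measures wall-clock time, and the command-line flags shown (-a, -g, -m, -Y, -k, -S, -s) are assumptions about the Dieharder command line that should be checked against the installed version's help output. The seed value is arbitrary.

```python
import itertools
import subprocess
import time

def build_command(generator, multiply_p, xtrategy=0, ksflag=0, seed=1):
    """Build one "all-tests" command line (flags assumed; verify with `dieharder -h`)."""
    return ["dieharder", "-a",
            "-g", str(generator),
            "-m", str(multiply_p),
            "-Y", str(xtrategy),
            "-k", str(ksflag),
            "-S", str(seed),
            "-s", "1"]  # non-zero strategy: reseed at the start of each test

def timed_run(command):
    """Run `command` and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

def experiment_matrix(generators, multipliers):
    """One command per (generator, multiply_p) combination."""
    return [build_command(g, m)
            for g, m in itertools.product(generators, multipliers)]

commands = experiment_matrix(generators=[13], multipliers=[1, 2, 4, 8])
for cmd in commands:
    print(" ".join(cmd))
```

Calling timed_run on each command, several times per configuration, gives the repeated wall-clock measurements needed to spot the run-to-run variance caused by shared hardware.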

To generate the input file for testing the binary file generator in Dieharder, we used
/dev/urandom on the AWS EC2 instance to generate a 1 terabyte (TB) binary file on a 2 TB
Elastic Block Store (EBS) sc1 volume [21] attached to our EC2 instance. /dev/urandom was
used instead of /dev/random because /dev/random blocks when the entropy
pool is empty [22]. /dev/urandom is also the recommended source of the two
except when generating long-lived keys [22], [23].

3.3  Reproducibility 

To verify that the results of the multi-threaded statistical test suite were accurate, they
must be validated against Dieharder test results. For Dieharder and Dieharder-T to
execute exactly the same tests, the initial seed used for the PRNG must be explicitly declared
so that it is not randomly generated.

As random numbers are assigned in a block at the start of each test, the order in which the
tests execute affects the block of random numbers each test receives in normal operation of
the PRNG. Test options such as the RA or TTD Xtrategy modes mentioned in Section 2.3
also increase the number of random numbers needed for individual tests by an amount that
cannot be determined until the test has completed. Because a multi-threaded application
executes statistical tests in parallel, the order in which blocks of random numbers are
assigned differs from that of Dieharder, which executes the statistical tests sequentially.

To have a fair comparison of execution time and to verify that Dieharder-T was
implemented correctly, the experiments performed in this thesis were initialized with the same
initial seed, and the PRNG was reseeded with it at the start of each statistical test.
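
The effect of per-test reseeding can be demonstrated with a small sketch. A stand-in "test" simply consumes a given number of random values and returns their sum. With one shared generator, the values each test sees depend on execution order; with reseeding at the start of each test, they do not. The seed value, test names and draw counts are invented for illustration.

```python
import random

SEED = 42  # explicitly declared seed (value assumed for illustration)

def run_test(rng, draws):
    """Stand-in for a statistical test: consume `draws` values, return their sum."""
    return sum(rng.getrandbits(32) for _ in range(draws))

def run_shared_stream(order, draws):
    """One generator shared by all tests: results depend on execution order."""
    rng = random.Random(SEED)
    return {name: run_test(rng, draws[name]) for name in order}

def run_reseeded(order, draws):
    """Generator reseeded at the start of each test: order no longer matters."""
    return {name: run_test(random.Random(SEED), draws[name]) for name in order}

draws = {"A": 10, "B": 20, "C": 30}
serial, shuffled = ["A", "B", "C"], ["C", "A", "B"]

print(run_shared_stream(serial, draws) == run_shared_stream(shuffled, draws))
print(run_reseeded(serial, draws) == run_reseeded(shuffled, draws))
```

One consequence of this design is that every test starts from the same point in the generator's output stream, so the tests no longer examine disjoint portions of the sequence as they would in a serial run without reseeding.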




Table 3.1. List of Tests and ntuple Values for “All-Tests” (Benchmark) Mode

Test                                        ntuple values
------------------------------------------  -------------
Diehard Birthday Test                       0
Diehard OPERM5 Test                         0
Diehard 32x32 Binary Rank Test              0
Diehard 6x8 Binary Rank Test                0
Diehard Bitstream Test                      0
Diehard OPSO Test                           0
Diehard OQSO Test                           0
Diehard DNA Test                            0
Diehard Count the 1’s (stream) Test         0
Diehard Count the 1’s (byte) Test           0
Diehard Parking Lot Test                    0
Diehard Minimum Distance (2d Circle) Test   0
Diehard 3d Sphere (Minimum Distance) Test   0
Diehard Squeeze Test                        0
Diehard Sums Test                           0
Diehard Runs Test                           0
Diehard Craps Test                          0
Marsaglia and Tsang GCD Test                0
STS Monobit Test                            0
STS Runs Test                               0
STS Serial Test (Generalized)               0
RGB Bit Distribution Test                   1-12
RGB Generalized Minimum Distance Test       2-5
RGB Permutations Test                       2-5
RGB Lagged Sum Test                         0-32
RGB Kolmogorov-Smirnov Test                 0
DAB Byte Distribution                       0
DAB DCT                                     0
DAB Fill Tree Test                          0
DAB Fill Tree 2 Test                        0
DAB Monobit 2 Test                          0

CHAPTER 4:
Experiments  and  Analysis 


This chapter discusses the experiments performed in this thesis and analyzes the
results. The first three experiments were conducted to narrow down the parameters to be
used for the other experiments. The next two experiments were conducted to compare the
performance of Dieharder and Dieharder-T. The last two experiments were conducted to
evaluate the effects of the OpenMP scheduling policy. The full results can be found in the
Appendix.

4.1  PRNG  Source 

This experiment was designed to compare the two different methods for passing random
numbers to Dieharder. The experiment measured the execution time for running the
“all-tests” benchmark mode using file-based random number input and an unbounded random
number stream. The two PRNGs selected for the experiment were the Mersenne
Twister and raw file input as described in Section 3.2. To determine whether the run-time of
the “all-tests” benchmark mode increases linearly with the number of psamples,
the number of psamples was varied by changing the multiply_p value. The experiment
also helped to select the PRNG for the other experiments. The parameters used for this
experiment are listed in Table 4.1.


Table 4.1. List of Parameters for PRNG Comparison

Parameter       Value
--------------  --------------------------------------------
Test Suites     Dieharder
Xtrategy Mode   Normal (0)
KS Test         Default (0)
Tests           All-tests (a)
Generators      Mersenne Twister (13), Raw File Input (201)
Multiply P      1, 2, 4, 8

The run-times for the Mersenne Twister and raw file input generators are shown in Figure 4.1,
and the full results are shown in Appendix A.1. The results show a consistent rate of
increase in run-time as multiply_p is increased: the run-time doubled when psamples
was doubled for the Mersenne Twister. The total run-time of the raw file input generator was
much higher than that of the Mersenne Twister generator. The Mersenne Twister
generator was selected as the generator for the rest of the experiments, as it
reduced total run-time by a factor of 7.


Figure  4.1.  Dieharder  Run-time  for  Different  GSL  Generators 


4.2  Test  to  Destruction  Xtrategy  Mode 

The purpose of this experiment was to estimate the time taken to execute TTD Xtrategy
mode when multiply_p was increased. This experiment was the only experiment in this
thesis that did not use the “all-tests” benchmark mode, as testing under TTD mode
took more than a week to complete with “all-tests”. This experiment ran the Diehard
Bitstream Test with the parameters shown in Table 4.2. As RA and TTD modes required
the Kolmogorov-Smirnov test (KSTEST) flag to use the accurate mode (2) [17], all three
Xtrategy modes were executed in KSTEST accurate mode for consistency.

Table 4.2. List of Parameters for Test To Destruction Run-time Estimation

Parameter       Value
--------------  ------------------------------------------------------------
Test Suites     Dieharder
Xtrategy Mode   Default (0), Resolve Ambiguity (1), Test to Destruction (2)
KS Test         Accurate (2)
Tests           Diehard Bitstream Test (4)
Generators      Mersenne Twister (13)
Multiply P      1, 2, 4, 8

From the results shown in Figure 4.2, it can be concluded that the run-time for TTD was
nearly constant even when multiply_p was increased. This is not surprising, as individual
statistical tests were repeated with psamples in increments of 100 until the PRNG failed
the test or reached the maximum psamples defined by the xoff parameter. As the run-time
for individual statistical tests increased with increasing psample values, the runs
with smaller psample values contributed only a small percentage of the total run-time for the
experiment. Statistical tests with a multiply_p value of 2 (starting psamples of 200) performed
only one repetition fewer (at the defaults: psamples of 100, xoff of 100,000, increment
of 100), resulting in an insignificant reduction in total run-time as multiply_p was increased.
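
The arithmetic behind this observation can be checked with a short sketch. It assumes a simple cost model in which the generator never fails the test and the cost of each repetition is proportional to the number of psamples it uses. Both assumptions, and the function names, are illustrative and not taken from the Dieharder source.

```python
def ttd_repetitions(start, xoff=100_000, step=100):
    """psamples used at each repetition until xoff is reached (no failure assumed)."""
    return list(range(start, xoff + 1, step))

def relative_cost(start):
    """Total work, assuming each repetition costs in proportion to its psamples."""
    return sum(ttd_repetitions(start))

base = relative_cost(100)      # multiply_p = 1, starting psamples of 100
doubled = relative_cost(200)   # multiply_p = 2, starting psamples of 200
octupled = relative_cost(800)  # multiply_p = 8, starting psamples of 800

print(len(ttd_repetitions(100)), len(ttd_repetitions(200)))
print(doubled / base, octupled / base)
```

Under this model, raising multiply_p from 1 to 8 removes only the seven cheapest repetitions out of a thousand, so the total changes by a small fraction of one percent, consistent with the nearly flat TTD curve.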

The results also showed that the run-times for the default and RA Xtrategy modes were much
shorter than that for TTD mode. As shown in Appendix A.2, the run-time for
TTD was more than 300 times longer than that of default mode when multiply_p
was 8. Using the results from the previous experiment, the estimated run-time for TTD in
“all-tests” benchmark mode is at least 100 days. Since the time taken to run TTD would
be excessive, no further experiments were executed in this Xtrategy mode. The other two
modes showed similar execution times for a single statistical test, so they were evaluated
using the “all-tests” benchmark mode in the next experiment.

Figure 4.2. Dieharder Different Xtrategy Mode Run-time for Bitstream Test.
Y-axis in logarithmic time scale.


4.3  Xtrategy  Mode  Selection 

The goal of this experiment was to evaluate the default and RA Xtrategy modes to
determine the Xtrategy mode to be used for the rest of the experiments. The experiment was
executed using the “all-tests” benchmark mode with the parameters listed in Table 4.3.


Table 4.3. List of Parameters for Xtrategy Mode Selection

Parameter       Value
--------------  -----------------------------------
Test Suites     Dieharder
Xtrategy Mode   Default (0), Resolve Ambiguity (1)
KS Test         Accurate (2)
Tests           All-tests (a)
Generators      Mersenne Twister (13)
Multiply P      1, 2, 4, 8

As shown in Figure 4.3, both Xtrategy modes had similar execution times except when
multiply_p had a value of 4. At a multiply_p value of 4, 32 individual statistical tests were
repeated in RA mode to resolve ambiguity, resulting in an additional 9 minutes compared
to default mode, as shown in Appendix A.3.
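
The 9-minute difference can be reproduced from the three runs per mode at a multiply_p value of 4 listed in Appendix A.3. The snippet below is only a worked check of that arithmetic; the variable names are illustrative.

```python
from statistics import mean

# Run-times in seconds at multiply_p = 4, copied from the
# "All-Tests" Xtrategy mode table in Appendix A.3.
default_runs = [4390, 4383, 4417]
ra_runs = [4933, 4959, 4923]

extra_minutes = (mean(ra_runs) - mean(default_runs)) / 60
print(f"RA mode took about {extra_minutes:.1f} minutes longer than default mode")
```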


As the run-time for the default Xtrategy mode was nearly identical to, but less variable than,
that of RA mode, the default was selected as the Xtrategy mode for the rest of the experiments.
The run-time for RA is expected to follow that of default mode closely when there are few
ambiguous (weak) results.


Figure  4.3.  “All-Tests”  Run-time  for  Default  and  Resolve  Ambiguity  Modes 


4.4  Dieharder  and  Dieharder-T  Comparison 

The purpose of this experiment was to compare the run-time of Dieharder with the following
variants of Dieharder-T:

•  Dieharder-T  in  serial  mode. 

•  Thread  pool  Dieharder-T  with  1  to  4  thread  counts. 

•  Static  scheduling  OpenMP  Dieharder-T  with  1  to  4  thread  counts. 

The  parameters  used  for  this  experiment  are  shown  in  Table  4.4. 

Table 4.4. List of Parameters for Statistical Test Suites Comparison

Parameter       Value
Test Suites     Dieharder, Dieharder-T, Dieharder-T-OMP-S
Xtrategy Mode   Default (0)
KS Test         Default (0)
Tests           All-tests (a)
Generators      Mersenne Twister (13)
Multiply P      1, 2, 4, 8

As shown in Figure 4.4, Dieharder had the longest run-time, followed closely by Dieharder-T
running in serial mode. This showed that the modifications made to Dieharder described
in Section 3.1 improved code efficiency, reducing run-time as multiply_p was increased.

The run-times for 1-thread Dieharder-T and Dieharder-T-OMP-S were similar to that of
serial Dieharder-T, as expected.

The run-time for 2-thread Dieharder-T was significantly higher than that for 2-thread
Dieharder-T-OMP-S, probably due to the dynamic scheduling overhead in the thread pool. The
run-times for 3-thread and 4-thread Dieharder-T were similar to those for 2-thread and
3-thread Dieharder-T-OMP-S. As the thread pool uses a thread for polling job statuses, the
maximum number of central processing unit (CPU) cores available for executing statistical
tests was limited to 3 on the experiment platform with 4 virtual CPUs. This resulted in
3-thread and 4-thread Dieharder-T having similar run-times.

The 4-thread Dieharder-T-OMP-S had the shortest run-time of all statistical test suites
evaluated and was 16% faster than 3-thread Dieharder-T-OMP-S. The 3-thread Dieharder-T-OMP-S
was only about 2% faster than 2-thread Dieharder-T-OMP-S; this is examined in greater detail
in Section 4.5. The 2-thread Dieharder-T-OMP-S took almost half the time of 1-thread
Dieharder-T-OMP-S, as expected, since the tasks were split between two threads.

The results showed that Dieharder-T-OMP-S performed as well as or better than Dieharder-T,
depending on the number of threads used, making it the preferred choice for implementing
multi-threading for the PRNG test suite.
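
The relative run-times discussed above can be recomputed from the raw Dieharder-T-OMP-S runs in Appendix A.4. The sketch below uses only the runs at a multiply_p value of 1; the helper name `percent_faster` is an illustrative choice, and percentages computed at a single multiply_p value may differ slightly from figures averaged over all values.

```python
from statistics import mean

# Dieharder-T-OMP-S run-times in seconds at multiply_p = 1 (Appendix A.4),
# keyed by thread count.
runs = {
    1: [955, 956, 954],
    2: [494, 492, 492],
    3: [500, 493, 462],
    4: [392, 401, 394],
}
avg = {threads: mean(r) for threads, r in runs.items()}


def percent_faster(slow, fast):
    """Run-time reduction of `fast` relative to `slow`, in percent."""
    return 100 * (slow - fast) / slow


for threads in sorted(avg):
    print(f"{threads} thread(s): {avg[threads]:.0f} s, "
          f"speedup {avg[1] / avg[threads]:.2f}x over 1 thread")
print(f"3 threads vs 2 threads: {percent_faster(avg[2], avg[3]):.1f}% faster")
```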

Figure 4.4. “All-Tests” Run-time for Different Statistical Test Suites

X-axis: Multiply Psamples Factor. Series plotted: Dieharder; serial Dieharder-T; 1-thread to
4-thread Dieharder-T; 1-thread to 4-thread Dieharder-T-OMP-S.


4.5  Static Scheduling Individual Test Run-time

The goal of this experiment was to collect the run-time for individual statistical tests in each
thread for 2-thread and 3-thread Dieharder-T-OMP-S, in order to determine whether the similarity
in run-time despite a difference of one thread was due to inefficient allocation of
statistical tests to threads. The parameters used to run this experiment can be found in
Table 4.5.

Table 4.5. List of Parameters for Static Scheduling Run-time Collection

Parameter           Value
Test Suites         Dieharder-T-OMP-S
Number of Threads   2, 3
Xtrategy Mode       Default (0)
KS Test             Default (0)
Tests               All-tests (a)
Generators          Mersenne Twister (13)
Multiply P          1

The run-times for individual statistical tests in 2-thread and 3-thread Dieharder-T-OMP-S
are shown in Figure 4.5. The colors denote the order of statistical tests within a thread;
the same color does not represent the same test in 2-thread and 3-thread
Dieharder-T-OMP-S. Figure 4.5a shows that the run-times of the two threads were almost
equal, and Figure 4.5b shows that the run-times of the three threads varied greatly. Since
the run-time of the test suite is determined by the thread with the longest run-time, the
unequal distribution of run-time load among the threads led to only a 2% reduction in total
run-time despite the addition of a thread. The experiment described in Section 4.6 was
conducted to determine whether changing the OpenMP scheduling policy would improve the overall
performance regardless of thread count.
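
One way to quantify how evenly the load was spread is the ratio of the busiest thread's run-time to the mean run-time per thread. The sketch below applies that ratio to the per-thread totals listed in Appendix A.5; the `imbalance` metric is an illustrative choice made here, not a measure used in the experiments.

```python
def imbalance(thread_totals):
    """Busiest thread's run-time divided by the mean; 1.0 means perfectly balanced."""
    return max(thread_totals) / (sum(thread_totals) / len(thread_totals))


# Per-thread totals in seconds, copied from the static scheduling tables in Appendix A.5.
two_thread = [489, 480]
three_thread = [308, 392, 446]

print(f"2-thread imbalance: {imbalance(two_thread):.2f}")
print(f"3-thread imbalance: {imbalance(three_thread):.2f}")
```

A value close to 1.0 for two threads and a noticeably larger value for three threads is consistent with the near-equal bars in Figure 4.5a and the uneven bars in Figure 4.5b.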

Figure 4.5. Individual Statistical Tests Run-time for Individual Thread

Panels: (a) 2-Thread Static Scheduling; (b) 3-Thread Static Scheduling. Each panel plots
Thread ID against Run-time (Seconds).

The colors are used to denote order of statistical tests run within a thread. They
do not represent the same tests within (a) and (b).


4.6  OpenMP Scheduling Policy

In  this  experiment,  different  OpenMP  scheduling  policies  explained  in  Section  3.1.2  were 
evaluated  to  determine  their  impact  on  statistical  test  suite  run-time.  Static  and  dynamic 
scheduling  policies  were  executed  using  the  parameters  shown  in  Table  4.6. 


Table 4.6. List of Parameters for Scheduling Policy Comparison

Parameter       Value
Test Suites     Dieharder-T-OMP-S, Dieharder-T-OMP-D
Xtrategy Mode   Default (0)
KS Test         Default (0)
Tests           All-tests (a)
Generators      Mersenne Twister (13)
Multiply P      1, 2, 4, 8

The run-times for both scheduling policies are shown in Figure 4.6. The results show that
1-thread Dieharder-T-OMP-S and 1-thread Dieharder-T-OMP-D had similar run-times. This
was expected: as all tasks could only be allocated to one thread, there should be no
difference between static and dynamic scheduling.

2-thread Dieharder-T-OMP-S had a shorter run-time than 2-thread Dieharder-T-OMP-D.
This could be attributed to dynamic scheduling having a higher overhead and 2-thread
static scheduling being more efficient in this instance, as shown in Figure 4.5a.

When the thread count was increased to 3, Dieharder-T-OMP-D performed slightly better
than Dieharder-T-OMP-S. As seen from Figure 4.5b, 3-thread static scheduling was
inefficient when allocating tasks to threads. The overhead incurred in dynamic scheduling
was less costly than the inefficient allocation of tasks.

When the thread count was increased to 4, Dieharder-T-OMP-D did not improve in performance
relative to Dieharder-T-OMP-S, with both achieving similar run-times. The experiment
described in Section 4.7 was conducted to determine why both scheduling policies
had similar performance.
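
The trade-off between the two policies can be illustrated with a small simulation. The sketch below is a simplified model rather than the OpenMP runtime: static scheduling is modeled as contiguous, near-equal-sized blocks of tasks per thread, and dynamic scheduling as each idle thread taking the next task in order, with an optional per-task `overhead` parameter. The task durations are hypothetical.

```python
import heapq


def static_makespan(durations, threads):
    """Split tasks into contiguous, near-equal-sized blocks, one block per thread."""
    base, extra = divmod(len(durations), threads)
    totals, start = [], 0
    for t in range(threads):
        size = base + (1 if t < extra else 0)
        totals.append(sum(durations[start:start + size]))
        start += size
    return max(totals)


def dynamic_makespan(durations, threads, overhead=0.0):
    """Hand each task, in order, to whichever thread becomes free first."""
    free_at = [0.0] * threads
    heapq.heapify(free_at)
    for d in durations:
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + d + overhead)
    return max(free_at)


# Hypothetical task run-times in seconds, with one long task in the middle.
durations = [1, 2, 3, 4, 20, 5, 6, 7, 8]
print("static :", static_makespan(durations, 3))
print("dynamic:", dynamic_makespan(durations, 3))
print("dynamic with 1 s overhead per task:", dynamic_makespan(durations, 3, overhead=1.0))
```

In this model the dynamic policy finishes sooner when task lengths are uneven, but its advantage shrinks as the per-task overhead grows, which mirrors the behavior discussed above.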

Figure 4.6. OMP Dynamic Scheduling Run-time

X-axis: Multiply Psamples Factor. Series plotted: 1-thread to 4-thread Dieharder-T-OMP-S and
1-thread to 4-thread Dieharder-T-OMP-D.


4.7  4-thread  Scheduling  Individual  Run-time 

This experiment was performed to collect the run-time for individual statistical tests in each
thread for 4-thread static and dynamic scheduling, in order to identify why the run-times
for 4-thread static and dynamic scheduling were similar, as described in Section 4.6. The
parameters used to run this experiment are shown in Table 4.7.


Table 4.7. List of Parameters for 4-thread Individual Run-time Collection

Parameter           Value
Test Suites         Dieharder-T-OMP-S, Dieharder-T-OMP-D
Number of Threads   4
Xtrategy Mode       Default (0)
KS Test             Default (0)
Tests               All-tests (a)
Generators          Mersenne Twister (13)
Multiply P          1, 2, 4, 8

As shown in Figure 4.7a, dynamic scheduling allocated statistical tests to threads very
efficiently. Dynamic scheduling works well when tasks do not have consistent run-times, as is
the case for the individual statistical tests, whose run-times differed in the experiments.
The disadvantage of dynamic scheduling is that it incurs higher overhead, as it has to pause
the thread every time it has to allocate a new task. The purpose of this experiment was to
determine whether the extra overhead incurred was less than the time saved from efficient
allocation of tasks.

The run-times for static scheduling are shown in Figure 4.7b. Although the same number
of tasks was assigned to each thread, the time taken to complete all tasks varied widely
among the four threads. The faster threads were idle for 43%, 30%, and 13% of the duration
of the experiment, showing clear inefficiency in task allocation.
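
Idle percentages of this kind are obtained by comparing each thread's busy time with that of the slowest thread, which sets the duration of the whole run. The sketch below shows the calculation; the busy times are hypothetical values chosen so that the result matches the percentages quoted above, and are not taken from the measurements.

```python
def idle_percent(thread_totals):
    """Percentage of the overall run during which each thread has no work left."""
    makespan = max(thread_totals)
    return [round(100 * (makespan - t) / makespan) for t in thread_totals]


# Hypothetical per-thread busy times in seconds (not measured data).
busy = [228, 280, 348, 400]
print(idle_percent(busy))
```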

Using both results, it can be concluded that dynamic scheduling was able to allocate tasks
efficiently among the threads, and static scheduling was able to allocate tasks to threads with
little overhead. It was concluded that the most efficient way to implement multi-threading
would be to apply static scheduling to tasks that are arranged such that the tasks in each
thread add up to a similar run-time. This approach would require each statistical test to be
benchmarked and assigned a value indicating the time required. This would allow the program to
calculate the estimated run-time for each thread and arrange the tasks such that each thread
would get tasks adding up to the estimated run-time.


Figure 4.7. Run-time for Different Dieharder-T OpenMP Scheduling Types

Panels: (a) Dynamic Scheduling; (b) Static Scheduling. Each panel plots Thread ID against
Run-time (Seconds).

The colors are used to denote order of statistical tests run within a thread. They
do not represent the same tests within (a) and (b).


CHAPTER 5:
Conclusion


The  primary  contributions  of  this  thesis  are  as  follows: 

•  We modified Dieharder to support multi-threading, creating a variant test suite
(Dieharder-T).

•  We integrated Gondree’s thread pool library into Dieharder-T and evaluated the
performance of the test suite.

•  We integrated OpenMP into Dieharder-T and evaluated the performance of the test suite
with both static and dynamic scheduling.

Our experimental results show that multi-threading reduced the run-time of Dieharder.
Due to inefficient scheduling in Dieharder, multi-threading showed a decrease in run-time
when the number of threads was increased. OpenMP with static scheduling showed the
most consistent reduction of run-time when compared to Dieharder.

The run-time for static scheduling was halved when the number of threads was increased
to two. This was achieved because the run-times of the two threads were similar, resulting in
a halving of total run-time. The run-time was not halved again when the number of threads
was increased to four, as there was great disparity among the threads' run-times.

The results showed that dynamic scheduling did not provide better performance than static
scheduling, due to the overhead incurred when allocating tasks to threads. 2-thread
dynamic scheduling did not halve the run-time of 1-thread dynamic scheduling because of this
overhead. The overhead appeared consistent even as the number of threads increased,
resulting in declining performance when the number of threads was increased. This made
dynamic scheduling unsuitable for multi-threading in our PRNG test suite.

Based on the conclusions described in Section 4.7, efficient allocation of tasks can be
achieved by combining the advantages of both types of scheduling policy. This would
allow the reduction in run-time to be more consistent when the number of threads is
increased. By rearranging the tasks such that the tasks in each thread add up to similar
run-times, the threads would not idle for long. This approach could use the data from
Appendix A.6 to assign an estimated run-time to each statistical test. An additional sorting
stage would then arrange the statistical tests for assignment to each thread before the tasks
are allocated with static scheduling. This would assign tasks to threads using static
scheduling while achieving a low variance of run-time between threads, similar to dynamic
scheduling.
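
The recommended sorting stage could be sketched as follows. This is one possible realization and not part of Dieharder-T: it uses a greedy longest-first heuristic that hands each test, in decreasing order of estimated run-time, to the thread with the smallest load so far. The function name `balance_tests` is hypothetical, and the estimates are a handful of measured run-times taken from the per-test tables in Appendix A.5.

```python
import heapq


def balance_tests(estimates, threads):
    """Assign tests to threads, longest first, always choosing the least-loaded thread."""
    heap = [(0, t) for t in range(threads)]  # (seconds assigned so far, thread id)
    heapq.heapify(heap)
    plan = [[] for _ in range(threads)]
    ordered = sorted(estimates.items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in ordered:
        load, t = heapq.heappop(heap)
        plan[t].append(name)
        heapq.heappush(heap, (load + seconds, t))
    loads = [0] * threads
    for load, t in heap:
        loads[t] = load
    return plan, loads


# Estimated run-times in seconds for a few tests (values from Appendix A.5).
estimates = {
    "Marsaglia and Tsang GCD": 161,
    "RGB Generalized Minimum Distance (ntuple 5)": 51,
    "RGB Bit Distribution (ntuple 12)": 46,
    "Diehard DNA": 29,
    "RGB Bit Distribution (ntuple 11)": 26,
    "RGB Lagged Sum (ntuple 32)": 26,
}
plan, loads = balance_tests(estimates, 2)
for t, (names, load) in enumerate(zip(plan, loads)):
    print(f"thread {t}: {load} s -> {names}")
```

The resulting per-thread lists could then be run with static scheduling, so that no per-task allocation is needed at run-time.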

APPENDIX: Experiments Result


This appendix contains the raw experimental data collected during the experiments
described in Chapter 4. The “change factor” shown in the tables below is the change in
run-time when the number of psamples is doubled.
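
As a worked example, the change factors for the Mersenne Twister generator can be recomputed from the three runs per multiply_p value in Table A.1. The sketch assumes that the change factor is the ratio of the average run-time at one multiply_p value to the average at the previous (halved) value; the variable names are illustrative.

```python
from statistics import mean

# Run-times in seconds for the Mersenne Twister generator (Table A.1),
# keyed by multiply_p.
runs = {
    1: [946, 941, 947],
    2: [1887, 1887, 1887],
    4: [3775, 3778, 3776],
    8: [7549, 7544, 7540],
}
avg = {m: mean(r) for m, r in runs.items()}

# Ratio of each average to the average at half the multiply_p value.
factors = {m: round(avg[m] / avg[m // 2], 2) for m in (2, 4, 8)}
print(factors)
```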


A.1  Dieharder Generators Run-time

This section contains the raw experimental data that was collected for Dieharder generators.


Table A.1. Run-time for Mersenne Twister Generator

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor
1           946        941        947        0.26             -
2           1887       1887       1887       0.52             2.00
4           3775       3778       3776       1.05             2.00
8           7549       7544       7540       2.10             2.00

Table A.2. Run-time for Raw File Input Generator

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor
1           2590       3789       3647       0.93             -
2           6815       13607      13343      3.13             3.37
4           15340      32972      32822      7.51             2.40
8           33456      71390      71525      16.33            2.17

A.2  Bitstream Test Run-time

This section contains the raw experimental data that was collected for the Bitstream test.


Table A.3. Bitstream Test Run-time for Xtrategy Modes

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (S)      Change Factor

Default
1           1.28       1.25       1.25       1.26             -
2           2.48       2.48       2.48       2.48             1.97
4           4.96       4.95       4.95       4.95             2.00
8           9.91       9.91       9.90       9.91             2.00

Resolve Ambiguity
1           1.24       1.25       1.25       1.25             -
2           2.49       2.49       2.48       2.49             2.00
4           4.96       4.96       4.87       4.93             1.98
8           9.76       9.90       9.82       9.82             1.99

Test to Destruction
1           3247       3139       3271       3235             -
2           3258       3287       3302       3282             1.01
4           3268       3267       3276       3277             1.00
8           3259       3274       3278       3273             1.00

A.3  Xtrategy Modes Run-time

This section contains the raw experimental data that was collected for different Xtrategy
modes.


Table A.4. “All-Tests” Run-time for Xtrategy Modes

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor

Default
1           1106       1110       1106       0.31             -
2           2216       2206       2207       0.61             2.00
4           4390       4383       4417       1.22             1.99
8           8792       8752       8864       2.45             2.00

Resolve Ambiguity
1           1143       1143       1150       0.32             -
2           2209       2214       2209       0.61             1.93
4           4933       4959       4923       1.37             2.23
8           8990       8958       8956       2.49             1.82

A.4  Statistical Test Suites Run-time

This section contains the raw experimental data that was collected for different statistical
test suites.


Table A.5. Statistical Test Suite Run-time - Non-threaded

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor

Dieharder
1           1106       1110       1106       0.31             -
2           2216       2206       2207       0.61             2.00
4           4390       4383       4417       1.22             1.99
8           8792       8752       8864       2.45             2.00

Dieharder-T Serial
1           923        1307       1313       0.33             -
2           2488       2108       1928       0.60             1.84
4           3828       4404       4055       1.14             1.88
8           7449       8557       7964       2.22             1.95



Table A.6. Statistical Test Suite Run-time - Dieharder-T

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor

Dieharder-T 1-Thread
1           975        973        973        0.27             -
2           1919       1922       1927       0.53             1.97
4           3847       3829       3849       1.07             2.00
8           7672       8714       7677       2.23             2.08

Dieharder-T 2-Thread
1           618        617        627        0.17             -
2           1348       1252       1236       0.36             2.06
4           2485       2466       2570       0.70             1.96
8           5031       4731       5038       1.37             1.97

Dieharder-T 3-Thread
1           476        472        474        0.13             -
2           954        952        951        0.26             2.01
4           1882       1881       2006       0.53             2.02
8           3950       4058       3807       1.09             2.05

Dieharder-T 4-Thread
1           463        476        466        0.13             -
2           945        950        939        0.26             2.02
4           1903       1893       1894       0.53             2.01
8           3727       3812       3743       1.04             1.99

Table A.7. Statistical Test Suite Run-time - Dieharder-T-OMP-S

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor

Dieharder-T-OMP-S 1-Thread
1           955        956        954        0.27             -
2           1884       1888       1881       0.52             1.97
4           3774       3777       3773       1.05             2.00
8           7572       7534       7541       2.10             2.00

Dieharder-T-OMP-S 2-Thread
1           494        492        492        0.14             -
2           970        966        965        0.27             1.96
4           1934       1935       1931       0.54             2.00
8           3857       3860       3859       1.07             2.00

Dieharder-T-OMP-S 3-Thread
1           500        493        462        0.13             -
2           962        957        924        0.26             1.95
4           1811       1984       1809       0.52             1.97
8           3655       3991       3710       1.05             2.03

Dieharder-T-OMP-S 4-Thread
1           392        401        394        0.11             -
2           791        792        791        0.22             2.00
4           1567       1570       1617       0.44             2.00
8           3207       3145       3201       0.88             2.01

Table A.8. Statistical Test Suite Run-time - Dieharder-T-OMP-D

Multiply_P  Run 1 (S)  Run 2 (S)  Run 3 (S)  Average (Hours)  Change Factor

Dieharder-T-OMP-D 1-Thread
1           964        958        952        0.27             -
2           1881       1882       1909       0.53             1.97
4           3801       3790       3790       1.05             2.01
8           7543       7557       7562       2.10             2.00

Dieharder-T-OMP-D 2-Thread
1           565        557        552        0.16             -
2           1102       1163       1150       0.32             2.04
4           2259       2301       2300       0.64             2.01
8           4384       4434       4591       1.24             1.95

Dieharder-T-OMP-D 3-Thread
1           438        453        455        0.12             -
2           855        885        863        0.24             1.93
4           1709       1754       1832       0.49             2.03
8           3428       3618       3734       1.00             2.04

Dieharder-T-OMP-D 4-Thread
1           395        397        394        0.11             -
2           788        792        790        0.22             2.00
4           1630       1479       1598       0.44             1.99
8           3190       3294       3282       0.90             2.07

A.5  Static  Scheduling  Run-time  Per  Thread 

This  section  contains  the  raw  experimental  data  that  was  collected  for  2-thread  and  3-thread 
static  scheduling. 

Table A.9. 2-Thread Static Scheduling Tests Run-time - Thread 0

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
0          Diehard Birthday Test                       0       1
0          Diehard OPERM5 Test                         0       4
0          Diehard 32x32 Binary Rank Test              0       14
0          Diehard 6x8 Binary Rank Test                0       2
0          Diehard Bitstream Test                      0       1
0          Diehard OPSO Test                           0       3
0          Diehard OQSO Test                           0       2
0          Diehard DNA Test                            0       29
0          Diehard Count the 1's (stream) Test         0       1
0          Diehard Count the 1's (byte) Test           0       2
0          Diehard Parking Lot Test                    0       2
0          Diehard Minimum Distance (2d Circle) Test   2       1
0          Diehard 3d Sphere (Minimum Distance) Test   3       4
0          Diehard Squeeze Test                        0       2
0          Diehard Sums Test                           0       1
0          Diehard Runs Test                           0       1
0          Diehard Craps Test                          0       4
0          Marsaglia and Tsang GCD Test                0       161
0          STS Monobit Test                            1       1
0          STS Runs Test                               2       10
0          STS Serial Test                             1       11
0          RGB Bit Distribution Test                   1       2
0          RGB Bit Distribution Test                   2       2
0          RGB Bit Distribution Test                   3       2
0          RGB Bit Distribution Test                   4       3
0          RGB Bit Distribution Test                   5       3
0          RGB Bit Distribution Test                   6       5
0          RGB Bit Distribution Test                   7       7
0          RGB Bit Distribution Test                   8       10
0          RGB Bit Distribution Test                   9       14
0          RGB Bit Distribution Test                   10      18
0          RGB Bit Distribution Test                   11      26
0          RGB Bit Distribution Test                   12      46
0          RGB Generalized Minimum Distance Test       2       6
0          RGB Generalized Minimum Distance Test       3       9
0          RGB Generalized Minimum Distance Test       4       24
0          RGB Generalized Minimum Distance Test       5       51
0          RGB Permutations Test                       2       1
0          RGB Permutations Test                       3       1
0          RGB Permutations Test                       4       2
Total                                                          489

Table A.10. 2-Thread Static Scheduling Tests Run-time - Thread 1

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
1          RGB Permutations Test                       5       5
1          RGB Lagged Sum Test                         0       1
1          RGB Lagged Sum Test                         1       2
1          RGB Lagged Sum Test                         2       3
1          RGB Lagged Sum Test                         3       4
1          RGB Lagged Sum Test                         4       5
1          RGB Lagged Sum Test                         5       5
1          RGB Lagged Sum Test                         6       7
1          RGB Lagged Sum Test                         7       7
1          RGB Lagged Sum Test                         8       8
1          RGB Lagged Sum Test                         9       8
1          RGB Lagged Sum Test                         10      9
1          RGB Lagged Sum Test                         11      11
1          RGB Lagged Sum Test                         12      11
1          RGB Lagged Sum Test                         13      11
1          RGB Lagged Sum Test                         14      12
1          RGB Lagged Sum Test                         15      14
1          RGB Lagged Sum Test                         16      14
1          RGB Lagged Sum Test                         17      15
1          RGB Lagged Sum Test                         18      16
1          RGB Lagged Sum Test                         19      17
1          RGB Lagged Sum Test                         20      17
1          RGB Lagged Sum Test                         21      18
1          RGB Lagged Sum Test                         22      19
1          RGB Lagged Sum Test                         23      19
1          RGB Lagged Sum Test                         24      20
1          RGB Lagged Sum Test                         25      21
1          RGB Lagged Sum Test                         26      22
1          RGB Lagged Sum Test                         27      23
1          RGB Lagged Sum Test                         28      23
1          RGB Lagged Sum Test                         29      24
1          RGB Lagged Sum Test                         30      25
1          RGB Lagged Sum Test                         31      25
1          RGB Lagged Sum Test                         32      26
1          RGB Kolmogorov-Smirnov Test                 0       1
1          DAB Byte Distribution                       0       2
1          DAB DCT                                     256     1
1          DAB Fill Tree Test                          32      2
1          DAB Fill Tree 2 Test                        0       4
1          DAB Monobit 2 Test                          12      3
Total                                                          480

Table A.11. 3-Thread Static Scheduling Tests Run-time - Thread 0

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
0          Diehard Birthday Test                       0       2
0          Diehard OPERM5 Test                         0       4
0          Diehard 32x32 Binary Rank Test              0       16
0          Diehard 6x8 Binary Rank Test                0       2
0          Diehard Bitstream Test                      0       2
0          Diehard OPSO Test                           0       3
0          Diehard OQSO Test                           0       3
0          Diehard DNA Test                            0       31
0          Diehard Count the 1's (stream) Test         0       1
0          Diehard Count the 1's (byte) Test           0       1
0          Diehard Parking Lot Test                    0       2
0          Diehard Minimum Distance (2d Circle) Test   2       1
0          Diehard 3d Sphere (Minimum Distance) Test   3       4
0          Diehard Squeeze Test                        0       3
0          Diehard Sums Test                           0       1
0          Diehard Runs Test                           0       1
0          Diehard Craps Test                          0       5
0          Marsaglia and Tsang GCD Test                0       171
0          STS Monobit Test                            1       1
0          STS Runs Test                               2       20
0          STS Serial Test                             1       12
0          RGB Bit Distribution Test                   1       2
0          RGB Bit Distribution Test                   2       2
0          RGB Bit Distribution Test                   3       2
0          RGB Bit Distribution Test                   4       2
0          RGB Bit Distribution Test                   5       6
0          RGB Bit Distribution Test                   6       8
Total                                                          308

Table A.12. 3-Thread Static Scheduling Tests Run-time - Thread 1

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
1          RGB Bit Distribution Test                   7       10
1          RGB Bit Distribution Test                   8       14
1          RGB Bit Distribution Test                   9       14
1          RGB Bit Distribution Test                   10      21
1          RGB Bit Distribution Test                   11      38
1          RGB Bit Distribution Test                   12      59
1          RGB Generalized Minimum Distance Test       2       8
1          RGB Generalized Minimum Distance Test       3       15
1          RGB Generalized Minimum Distance Test       4       35
1          RGB Generalized Minimum Distance Test       5       87
1          RGB Permutations Test                       2       1
1          RGB Permutations Test                       3       1
1          RGB Permutations Test                       4       2
1          RGB Permutations Test                       5       6
1          RGB Lagged Sum Test                         0       1
1          RGB Lagged Sum Test                         1       1
1          RGB Lagged Sum Test                         2       4
1          RGB Lagged Sum Test                         3       4
1          RGB Lagged Sum Test                         4       5
1          RGB Lagged Sum Test                         5       5
1          RGB Lagged Sum Test                         6       7
1          RGB Lagged Sum Test                         7       7
1          RGB Lagged Sum Test                         8       7
1          RGB Lagged Sum Test                         9       9
1          RGB Lagged Sum Test                         10      9
1          RGB Lagged Sum Test                         11      11
1          RGB Lagged Sum Test                         12      11
Total                                                          392



Table A.13. 3-Thread Static Scheduling Tests Run-time - Thread 2

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
2          RGB Lagged Sum Test                         13      16
2          RGB Lagged Sum Test                         14      16
2          RGB Lagged Sum Test                         15      16
2          RGB Lagged Sum Test                         16      17
2          RGB Lagged Sum Test                         17      16
2          RGB Lagged Sum Test                         18      19
2          RGB Lagged Sum Test                         19      19
2          RGB Lagged Sum Test                         20      19
2          RGB Lagged Sum Test                         21      21
2          RGB Lagged Sum Test                         22      19
2          RGB Lagged Sum Test                         23      23
2          RGB Lagged Sum Test                         24      25
2          RGB Lagged Sum Test                         25      28
2          RGB Lagged Sum Test                         26      29
2          RGB Lagged Sum Test                         27      26
2          RGB Lagged Sum Test                         28      23
2          RGB Lagged Sum Test                         29      24
2          RGB Lagged Sum Test                         30      25
2          RGB Lagged Sum Test                         31      25
2          RGB Lagged Sum Test                         32      27
2          RGB Kolmogorov-Smirnov Test                 0       1
2          DAB Byte Distribution                       0       1
2          DAB DCT                                     256     1
2          DAB Fill Tree Test                          32      3
2          DAB Fill Tree 2 Test                        0       4
2          DAB Monobit 2 Test                          12      3
Total                                                          446

A.6  Static and Dynamic Scheduling Run-time

This section contains the raw experimental data that was collected for 4-thread static and
dynamic scheduling.


Table A.14. 4-Thread Static Scheduling Tests Run-time - Thread 0

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
0          Diehard Birthday Test                       0       1
0          Diehard OPERM5 Test                         0       4
0          Diehard 32x32 Binary Rank Test              0       14
0          Diehard 6x8 Binary Rank Test                0       2
0          Diehard Bitstream Test                      0       1
0          Diehard OPSO Test                           0       3
0          Diehard OQSO Test                           0       2
0          Diehard DNA Test                            0       29
0          Diehard Count the 1's (stream) Test         0       1
0          Diehard Count the 1's (byte) Test           0       2
0          Diehard Parking Lot Test                    0       2
0          Diehard Minimum Distance (2d Circle) Test   2       1
0          Diehard 3d Sphere (Minimum Distance) Test   3       4
0          Diehard Squeeze Test                        0       2
0          Diehard Sums Test                           0       1
0          Diehard Runs Test                           0       1
0          Diehard Craps Test                          0       4
0          Marsaglia and Tsang GCD Test                0       161
0          STS Monobit Test                            1       1
0          STS Runs Test                               2       10
Total                                                          287

Table A.15. 4-Thread Static Scheduling Tests Run-time - Thread 1

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
1          STS Serial Test                             1       11
1          RGB Bit Distribution Test                   1       2
1          RGB Bit Distribution Test                   2       2
1          RGB Bit Distribution Test                   3       2
1          RGB Bit Distribution Test                   4       3
1          RGB Bit Distribution Test                   5       3
1          RGB Bit Distribution Test                   6       5
1          RGB Bit Distribution Test                   7       7
1          RGB Bit Distribution Test                   8       10
1          RGB Bit Distribution Test                   9       14
1          RGB Bit Distribution Test                   10      18
1          RGB Bit Distribution Test                   11      26
1          RGB Bit Distribution Test                   12      46
1          RGB Generalized Minimum Distance Test       2       6
1          RGB Generalized Minimum Distance Test       3       9
1          RGB Generalized Minimum Distance Test       4       24
1          RGB Generalized Minimum Distance Test       5       51
1          RGB Permutations Test                       2       1
1          RGB Permutations Test                       3       1
1          RGB Permutations Test                       4       2
Total                                                          348

Table A.16. 4-Thread Static Scheduling Tests Run-time - Thread 2

Thread ID  Statistical Test                            ntuple  Run-time (Seconds)
2          RGB Permutations Test                       5       5
2          RGB Lagged Sum Test                         0       1
2          RGB Lagged Sum Test                         1       2
2          RGB Lagged Sum Test                         2       3
2          RGB Lagged Sum Test                         3       4
2          RGB Lagged Sum Test                         4       5
2          RGB Lagged Sum Test                         5       5
2          RGB Lagged Sum Test                         6       7
2          RGB Lagged Sum Test                         7       7
2          RGB Lagged Sum Test                         8       8
2          RGB Lagged Sum Test                         9       8
2          RGB Lagged Sum Test                         10      9
2          RGB Lagged Sum Test                         11      11
2          RGB Lagged Sum Test                         12      11
2          RGB Lagged Sum Test                         13      11
2          RGB Lagged Sum Test                         14      12
2          RGB Lagged Sum Test                         15      14
2          RGB Lagged Sum Test                         16      14
2          RGB Lagged Sum Test                         17      15
2          RGB Lagged Sum Test                         18      16
Total                                                          242
Table A.17. 4-Thread Static Scheduling Tests Run-time - Thread 3

| Thread ID | Statistical Test | ntuple | Run-time (Seconds) |
|---|---|---|---|
| 3 | RGB Lagged Sum Test | 19 | 17 |
| 3 | RGB Lagged Sum Test | 20 | 17 |
| 3 | RGB Lagged Sum Test | 21 | 18 |
| 3 | RGB Lagged Sum Test | 22 | 19 |
| 3 | RGB Lagged Sum Test | 23 | 19 |
| 3 | RGB Lagged Sum Test | 24 | 20 |
| 3 | RGB Lagged Sum Test | 25 | 21 |
| 3 | RGB Lagged Sum Test | 26 | 22 |
| 3 | RGB Lagged Sum Test | 27 | 23 |
| 3 | RGB Lagged Sum Test | 28 | 23 |
| 3 | RGB Lagged Sum Test | 29 | 24 |
| 3 | RGB Lagged Sum Test | 30 | 25 |
| 3 | RGB Lagged Sum Test | 31 | 25 |
| 3 | RGB Lagged Sum Test | 32 | 26 |
| 3 | RGB Kolmogorov-Smirnov Test | 0 | 1 |
| 3 | DAB Byte Distribution | 0 | 2 |
| 3 | DAB DCT | 256 | 1 |
| 3 | DAB Fill Tree Test | 32 | 2 |
| 3 | DAB Fill Tree 2 Test | 0 | 4 |
| 3 | DAB Monobit 2 Test | 12 | 3 |
| Total | | | 392 |

Table A.18. 4-Thread Dynamic Scheduling Tests Run-time - Thread 0

| Thread ID | Statistical Test | ntuple | Run-time (Seconds) |
|---|---|---|---|
| 0 | Diehard Birthday Test | 0 | 4 |
| 0 | Diehard OPSO Test | 0 | 3 |
| 0 | Diehard DNA Test | 0 | 26 |
| 0 | RGB Bit Distribution Test | 1 | 13 |
| 0 | RGB Bit Distribution Test | 5 | 18 |
| 0 | RGB Bit Distribution Test | 8 | 22 |
| 0 | RGB Bit Distribution Test | 11 | 46 |
| 0 | RGB Generalized Minimum Distance Test | 4 | 47 |
| 0 | RGB Lagged Sum Test | 0 | 1 |
| 0 | RGB Lagged Sum Test | 1 | 2 |
| 0 | RGB Lagged Sum Test | 3 | 6 |
| 0 | RGB Lagged Sum Test | 5 | 8 |
| 0 | RGB Lagged Sum Test | 8 | 11 |
| 0 | RGB Lagged Sum Test | 11 | 15 |
| 0 | RGB Lagged Sum Test | 14 | 17 |
| 0 | RGB Lagged Sum Test | 18 | 23 |
| 0 | RGB Lagged Sum Test | 22 | 29 |
| 0 | RGB Lagged Sum Test | 26 | 32 |
| 0 | RGB Lagged Sum Test | 30 | 36 |
| 0 | DAB Fill Tree 2 Test | 0 | 5 |
| Total | | | 364 |

Table A.19. 4-Thread Dynamic Scheduling Tests Run-time - Thread 1

| Thread ID | Statistical Test | ntuple | Run-time (Seconds) |
|---|---|---|---|
| 1 | Diehard 32x32 Binary Rank Test | 0 | 18 |
| 1 | STS Monobit Test | 1 | 1 |
| 1 | STS Runs Test | 2 | 14 |
| 1 | RGB Bit Distribution Test | 2 | 10 |
| 1 | RGB Bit Distribution Test | 4 | 13 |
| 1 | RGB Bit Distribution Test | 6 | 18 |
| 1 | RGB Bit Distribution Test | 9 | 23 |
| 1 | RGB Bit Distribution Test | 12 | 72 |
| 1 | RGB Permutations Test | 2 | 1 |
| 1 | RGB Permutations Test | 3 | 1 |
| 1 | RGB Permutations Test | 4 | 3 |
| 1 | RGB Permutations Test | 5 | 7 |
| 1 | RGB Lagged Sum Test | 2 | 5 |
| 1 | RGB Lagged Sum Test | 4 | 7 |
| 1 | RGB Lagged Sum Test | 6 | 10 |
| 1 | RGB Lagged Sum Test | 9 | 13 |
| 1 | RGB Lagged Sum Test | 12 | 16 |
| 1 | RGB Lagged Sum Test | 15 | 21 |
| 1 | RGB Lagged Sum Test | 19 | 24 |
| 1 | RGB Lagged Sum Test | 23 | 28 |
| 1 | RGB Lagged Sum Test | 27 | 34 |
| 1 | RGB Lagged Sum Test | 31 | 34 |
| Total | | | 373 |

Table A.20. 4-Thread Dynamic Scheduling Tests Run-time - Thread 2

| Thread ID | Statistical Test | ntuple | Run-time (Seconds) |
|---|---|---|---|
| 2 | Diehard OPERM5 Test | 0 | 8 |
| 2 | Diehard Count the 1's (stream) Test | 0 | 1 |
| 2 | Diehard Count the 1's (byte) Test | 0 | 2 |
| 2 | Diehard Minimum Distance (2d Circle) Test | 2 | 1 |
| 2 | Diehard Squeeze Test | 0 | 3 |
| 2 | Diehard Sums Test | 0 | 1 |
| 2 | Diehard Runs Test | 0 | 1 |
| 2 | Diehard Craps Test | 0 | 5 |
| 2 | STS Serial Test | 1 | 19 |
| 2 | RGB Bit Distribution Test | 3 | 18 |
| 2 | RGB Bit Distribution Test | 7 | 21 |
| 2 | RGB Bit Distribution Test | 10 | 31 |
| 2 | RGB Generalized Minimum Distance Test | 2 | 9 |
| 2 | RGB Generalized Minimum Distance Test | 3 | 17 |
| 2 | RGB Generalized Minimum Distance Test | 5 | 101 |
| 2 | RGB Lagged Sum Test | 17 | 22 |
| 2 | RGB Lagged Sum Test | 21 | 28 |
| 2 | RGB Lagged Sum Test | 25 | 31 |
| 2 | RGB Lagged Sum Test | 29 | 35 |
| 2 | RGB Kolmogorov-Smirnov Test | 0 | 2 |
| 2 | DAB Byte Distribution | 0 | 2 |
| 2 | DAB DCT | 256 | 2 |
| 2 | DAB Fill Tree Test | 32 | 4 |
| 2 | DAB Monobit 2 Test | 12 | 4 |
| Total | | | 368 |

Table A.21. 4-Thread Dynamic Scheduling Tests Run-time - Thread 3

| Thread ID | Statistical Test | ntuple | Run-time (Seconds) |
|---|---|---|---|
| 3 | Diehard 6x8 Binary Rank Test | 0 | 4 |
| 3 | Diehard Bitstream Test | 0 | 1 |
| 3 | Diehard OQSO Test | 0 | 4 |
| 3 | Diehard Parking Lot Test | 0 | 2 |
| 3 | Diehard 3d Sphere (Minimum Distance) Test | 3 | 4 |
| 3 | Marsaglia and Tsang GCD Test | 0 | 179 |
| 3 | RGB Lagged Sum Test | 7 | 10 |
| 3 | RGB Lagged Sum Test | 10 | 14 |
| 3 | RGB Lagged Sum Test | 13 | 17 |
| 3 | RGB Lagged Sum Test | 16 | 23 |
| 3 | RGB Lagged Sum Test | 20 | 25 |
| 3 | RGB Lagged Sum Test | 24 | 29 |
| 3 | RGB Lagged Sum Test | 28 | 34 |
| 3 | RGB Lagged Sum Test | 32 | 33 |
| Total | | | 379 |